-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix SplitAssignment filtering #15933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix SplitAssignment filtering #15933
Conversation
JunhyungSong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this not to reschedule when numSplits <= guaranteedSplitsPerRequest already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, or any other situation where the new batch size would still allow sending the number of splits that we have for the current request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be a bug(other than numSplits <= guaranteedSplitsPerRequest). Because if requestSize > maxRequestSizeInBytes and newSplitBatchSize < currentSplitBatchSize, it should be numSplits <= guaranteedSplitsPerRequest || numSplits > newSplitBatchSize. Otherwise, that would mean that the split batch size decreasing logic is not working as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you're correct, but it seems more clear to make the check explicit and also handle numSplits < guaranteedSplitsPerRequest no? I'm happy to just drop this commit since the first one is already merged with the bugfix too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is useful. My thought is that it would be better to have something like checkState(numSplits > newSplitBatchSize || numSplits <= guaranteedSplitsPerRequest, "numSplits is not adjusted"); inside if (newSplitBatchSize < currentSplitBatchSize) { ... }.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, updated with that additional check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we were wrong about that numSplits assumption: the new batch size can be smaller than the old batch size even though numSplits <= newSplitBatchSize. I've pushed a new revision which updates the check to only reschedule when numSplits > newSplitBatchSize && requestSize > maxRequestSizeInBytes since that's logically the intention here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the new batch size can be smaller than the old batch size even though numSplits <= newSplitBatchSize.
I see that this can happen when requestSize < expectedSize && currentSplitBatchSize < maxUnacknowledgedSplits and currentSplitBatchSize > newSplitBatchSize. Good catch.
|
It looks like the second commit requires more discussion while the first one is ready to go. I'm going to pull the first commit in separately to unblock the release. |
|
First commit merged, this should unblock the release. Thanks @pettyjamesm , @JunhyungSong |
9da72e4 to
8a2698d
Compare
8a2698d to
eb7a68c
Compare
JunhyungSong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
Fixes 15931 which was introduced in #15721 and could cause split assignment updates that are empty after filtering previously acknowledged splits to trigger a sanity check validation in
DriverSplitRunnerFactory#enqueueSplitsand fail.This change should not require release notes because the issue being fixed has not yet shipped with a release.
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: